TACKLING IMBALANCED CLASS IN SOFTWARE DEFECT PREDICTION USING TWO-STEP CLUSTER BASED RANDOM UNDERSAMPLING AND STACKING TECHNIQUE
Authors
Abstract
Similar resources
Software Defect Prediction for High-Dimensional and Class-Imbalanced Data
Software quality and reliability can be improved using various techniques during the software development process. One effective method is to utilize software metrics and defect data collected during the software development life cycle and build defect predictors using data mining techniques to estimate the quality of target program modules. Such a strategy allows practitioners to intelligently...
Learning from Imbalanced Data Using Ensemble Methods and Cluster-Based Undersampling
Imbalanced data, where the number of instances of one class is much higher than the others, are frequent in many domains such as fraud detection, telecommunications management, oil spill detection and text classification. Traditional classifiers do not perform well when considering data that are susceptible to both within-class and between-class imbalances. In this paper, we propose the ClustFi...
A Novel Approach for Handling Imbalanced Data in Medical Diagnosis using Undersampling Technique
The imbalanced learning problem is becoming ubiquitous in data mining applications. Data sets with an unequal distribution of samples among classes are known as imbalanced data sets. When such highly imbalanced data sets are given to a classifier, the classifier may misclassify the rare samples from the minority class. To deal with such type of im...
Cluster-Based Image Segmentation Using Fuzzy Markov Random Field
Image segmentation is an important task in image processing and computer vision that attracts the attention of many researchers. The pixels in an image carry two kinds of information, statistical and structural, which refer to the feature values of pixel data and the local correlation of pixel data, respectively. Markov random field (MRF) is a tool for modeling statistical and structural inf...
Stacking Class Probabilities Obtained from View-Based Cluster Ensembles
In pattern recognition applications with a high number of input features and an insufficient number of samples, the curse of dimensionality can be overcome by extracting features from smaller feature subsets. Domain knowledge, for example, can be used to group some of the features together into what are known as “views”. The features extracted from views can later be combined (i.e. stacking) t...
Journal
Journal title: Jurnal Teknologi
Year: 2017
ISSN: 2180-3722, 0127-9696
DOI: 10.11113/jt.v79.11874